Abstract. We present a combinatorial interpretation of Berkowitz's algorithm. Berkowitz's
algorithm, defined in [?], is the fastest known parallel algorithm for computing the characteristic
polynomial of a matrix. Our combinatorial interpretation is based on "loop covers," introduced
by Valiant in [?], and "clow sequences," defined in [?]. Clow sequences turn out to capture very
succinctly the computations performed by Berkowitz's algorithm, which otherwise is quite difficult
to analyze. The main contribution of this paper is a proof of correctness of Berkowitz's algorithm in
terms of clow sequences (Theorem 3.11).

1. Introduction. Berkowitz's algorithm is the fastest known parallel algorithm
for computing the characteristic polynomial (char poly) of a matrix (and hence for
computing the determinant, the adjoint, and the inverse of a matrix, if it exists).
It can be formalized with small boolean circuits of depth $O(\log^2 n)$ in the size $n$ of the
underlying matrix. We shall describe precisely in the next section what we mean
by "small" and "depth," but the idea is that the circuits have polynomially many
gates in $n$, for an $n \times n$ matrix $A$, and the longest path in these circuits is a constant
multiple of $\log^2 n$.

There are two other fast parallel algorithms for computing the coefficients of the
characteristic polynomial of a matrix: Chistov's algorithm and Csanky's algorithm.
Chistov's algorithm is more difficult to formalize, and Csanky's algorithm works only
for fields of characteristic 0; see [?, Section 13.4] for all the details about these two algorithms.

The author's original motivation for studying Berkowitz's algorithm was the proof
complexity of linear algebra. Proof complexity deals with the complexity of formal
mathematical derivations, and it has applications in lower bounds and automated
theorem proving. In particular, the author was interested in the complexity of
derivations of matrix identities such as $AB = I \rightarrow BA = I$ (right matrix inverses are left
inverses). These identities have been proposed by Cook as candidates for separating
the Frege and Extended Frege proof systems. Proving (or disproving) this separation
is one of the outstanding problems in propositional proof complexity (see [?] for a
comprehensive exposition of this area of research).

Thus, we were interested in an algorithm that could compute inverses of matrices,
of the lowest complexity possible. Berkowitz's algorithm is ideal for our purposes for
several reasons:

• as was mentioned above, it has the lowest known complexity for computing
the char poly (we show in the next section that it can be formalized with
uniform NC$^2$ circuits: small circuits of small depth),
• it can be easily expressed with iterated matrix products, and hence it lends
itself to an easy formalization in first order logic (with three sorts: indices,
field elements, and matrices, see [?]),

• and it is field independent, as the algorithm does not require divisions, and
hence Berkowitz's algorithm can compute char polynomials over any
commutative ring.

Standard algorithms in linear algebra, such as Gaussian Elimination, do not lend
themselves to parallel computations. Gaussian Elimination is a sequential polynomial
time algorithm, and hence it falls in a complexity class far above the complexity of
Berkowitz's algorithm. Furthermore, Gaussian Elimination requires divisions, which
are messy to formalize, and are not "field independent." The cofactor expansion
requires computing $n!$ many terms, so it is an exponential algorithm, and hence not
tractable.

From the beginning, we were interested in proving the correctness of Berkowitz's
algorithm within its own complexity class. That is, for our applications, we wanted to
give a proof of correctness where the computations were not outside the complexity
class of Berkowitz's algorithm, but rather inside NC$^2$, meaning that the proof of
correctness should use iterated matrix products as its main engine for computations.
This turned out to be a very difficult problem.

The original proof of correctness of Berkowitz's algorithm relies on Samuelson's
Identity (shown in the next section), which in turn relies on Lagrange's Expansion,
which is widely infeasible (as it requires summing up $n!$ terms, for a matrix of size
$n \times n$). We managed to give a feasible (polytime) proof of correctness in [?], but the
hope is that it is possible to give a proof of correctness which does not need polytime
concepts, but rather concepts from the class NC$^2$. Note that by correctness, we
mean that we can prove the main properties of the char poly, which are: the
Cayley-Hamilton Theorem, and the multiplicativity of the determinant; all other "universal"
properties follow directly from these two.

Hence our interest in understanding the workings of Berkowitz's algorithm. In
this paper, we show that Berkowitz's algorithm computes sums of the so-called "clow
sequences." These are generalized permutations, and they seem to be the
conceptually cleanest way of showing what is going on in Berkowitz's algorithm. Since clow
sequences are generalized permutations, they do not lend themselves directly to a
feasible proof. However, the hope is that by understanding the complicated cancellations
of terms that take place in Berkowitz's algorithm, we will be able to assert properties
of Berkowitz's algorithm which do have NC$^2$ proofs, and which imply the correctness
of the algorithm. Clow sequences expose very concisely the cancellations of terms in
Berkowitz's algorithm.



The main contribution of this paper is given by Theorem 3.11, where we show
that Berkowitz's algorithm computes sums of clow sequences. The first combinatorial
interpretation of Berkowitz's algorithm was given by Valiant in [?], and it was given
in terms of "loop covers," which are similar to clow sequences. This was more of an
observation, however, and not many details were given. We give a detailed inductive
proof of correctness of Berkowitz's algorithm in terms of clow sequences, introduced
in [?].

2. Berkowitz's Algorithm. Berkowitz's algorithm computes the coefficients
of the char polynomial of a matrix $A$, $p_A(x) = \det(xI - A)$, by computing iterated
matrix products, and hence it can be formalized in the complexity class NC$^2$.

The complexity class NC$^2$ is the class of problems (parametrized by $n$; here $n$ is
the input size parameter) that can be computed with uniform boolean circuits (with
gates AND, OR, and NOT), of polynomial size in $n$ (i.e., polynomially many gates in
the input size), and $O(\log^2 n)$ depth (i.e., the longest path from an input gate of the
circuit to the output gate is in the order of $\log^2 n$).

For example, matrix powering is known to be in NC$^2$. The reason is that the
product of two matrices can be computed with boolean circuits of polynomial size
and logarithmic depth (i.e., in NC$^1$), and the $n$-th power of a matrix can be obtained
by repeated squaring (squaring $\log n$ many times for a matrix of size $n \times n$). Cook
defined [?] the complexity class POW to be the class of problems reducible to matrix
powering, and showed that NC$^1 \subseteq$ POW $\subseteq$ NC$^2$. Note that every time we make
the claim that Berkowitz's algorithm can be formalized in the class NC$^2$, we could
be making a stronger claim instead by saying that Berkowitz's algorithm can be
formalized in the class POW.

Berkowitz's algorithm computes the char polynomial of a matrix with iterated
matrix products. Iterated matrix products can easily be reduced to the problem of
matrix powering: place the $A_1, A_2, \ldots, A_n$ above the main diagonal of a new
matrix $B$ which is zero everywhere else, compute $B^n$, and extract $A_1 A_2 \cdots A_n$ from
the upper-right corner block of $B^n$. Hence, since Berkowitz's algorithm can be
computed with iterated matrix products (as we show in Definition 2.6 below), it follows
that Berkowitz's algorithm can be formalized in POW $\subseteq$ NC$^2$. The details are in
Lemma 2.7 below.
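
The reduction from iterated products to matrix powering can be sketched as follows. This is a minimal sequential illustration in Python, not a circuit construction; the function names `mat_mul` and `iterated_product_by_powering` are hypothetical, and matrices are assumed to be lists of rows with compatible dimensions.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def iterated_product_by_powering(mats):
    """Compute mats[0] * mats[1] * ... * mats[-1] by powering one block matrix B."""
    n = len(mats)
    # Block sizes: block t has as many rows as mats[t]; the final block has
    # as many rows as the last matrix has columns.
    sizes = [len(m) for m in mats] + [len(mats[-1][0])]
    offsets = [sum(sizes[:t]) for t in range(n + 2)]
    N = offsets[-1]
    B = [[0] * N for _ in range(N)]
    # Place the t-th matrix in block position (t, t+1), just above the diagonal.
    for t, m in enumerate(mats):
        for i, row in enumerate(m):
            for j, x in enumerate(row):
                B[offsets[t] + i][offsets[t + 1] + j] = x
    # Compute B^n by square-and-multiply (repeated squaring).
    result = [[int(i == j) for j in range(N)] for i in range(N)]
    base, e = B, n
    while e:
        if e & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        e >>= 1
    # The product sits in the upper-right corner block of B^n.
    return [row[offsets[n]:] for row in result[:sizes[0]]]


A1 = [[1, 2], [3, 4]]
A2 = [[0, 1], [1, 0]]
A3 = [[2], [5]]
print(iterated_product_by_powering([A1, A2, A3]))  # -> [[9], [23]]
```

The block matrix is nilpotent, so only the upper-right block of $B^n$ is nonzero; the sketch powers $B$ sequentially, whereas the NC$^2$ bound comes from doing the squarings with shallow circuits.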

The main idea in the standard proof of Berkowitz's algorithm (see [?]) is Samuelson's
identity, which relates the char polynomial of a matrix to the char polynomial
of its principal sub-matrix. Thus, the coefficients of the char polynomial of an $n \times n$
matrix $A$ below are computed in terms of the coefficients of the char polynomial of
$M$:

\[ A = \begin{pmatrix} a_{11} & R \\ S & M \end{pmatrix} \]

where $R$, $S$ and $M$ are $1 \times (n-1)$, $(n-1) \times 1$ and $(n-1) \times (n-1)$ sub-matrices,
respectively.

Lemma 2.1 (Samuelson's Identity). Let $p(x)$ and $q(x)$ be the char polynomials
of $A$ and $M$, respectively. Then:

\[ p(x) = (x - a_{11})\, q(x) - R \cdot \mathrm{adj}(xI - M) \cdot S \]


Recall that the adjoint of a matrix $A$ is the transpose of the matrix of cofactors
of $A$; that is, the $(i,j)$-th entry of $\mathrm{adj}(A)$ is given by $(-1)^{i+j} \det(A[j|i])$. Also recall
that $A[k|l]$ is the matrix obtained from $A$ by deleting the $k$-th row and the $l$-th column.
We also make up the following notation: $A[-|l]$ denotes that only the $l$-th column
has been deleted. Similarly, $A[k|-]$ denotes that only the $k$-th row has been deleted,
and $A[-|-] = A$.

1 See [?] for a comprehensive exposition of the parallel classes NC$^i$, POW, and related complexity
classes. This paper contains the details of all the definitions outlined in the above paragraphs.



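Proof. (Lemma 2.1)

\[ p(x) = \det(xI - A) = \det \begin{pmatrix} x - a_{11} & -R \\ -S & xI - M \end{pmatrix} \]

using the cofactor expansion along the first row:

\[ = (x - a_{11}) \det(xI - M) + \sum_{j=1}^{n-1} (-1)^j (-r_j) \det \underbrace{\begin{pmatrix} -S & (xI - M)[-|j] \end{pmatrix}}_{(*)} \]

where $R = (r_1 \, r_2 \, \ldots \, r_{n-1})$, and the matrix indicated by (*) is given as follows: the
first column is $-S$, and the remaining columns are given by $(xI - M)$ with the $j$-th
column deleted. We expand the determinant of (*) along the first column, i.e., along
the column $-S$, where $S = (s_1 \, s_2 \, \ldots \, s_{n-1})^T$, to obtain:

\[ = (x - a_{11})\, q(x) + \sum_{j=1}^{n-1} (-1)^j (-r_j) \sum_{i=1}^{n-1} (-1)^{i+1} (-s_i) \det((xI - M)[i|j]) \]

and rearranging:

\[ = (x - a_{11})\, q(x) - \sum_{j=1}^{n-1} r_j \left( \sum_{i=1}^{n-1} (-1)^{i+j} \det((xI - M)[i|j]) \, s_i \right) \]
\[ = (x - a_{11})\, q(x) - R \cdot \mathrm{adj}(xI - M) \cdot S \]

and we are done. □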

Lemma 2.2. Let $q(x) = q_{n-1} x^{n-1} + \cdots + q_1 x + q_0$ be the char polynomial of $M$,
and let:

\[ B(x) = \sum_{k=2}^{n} \left( q_{n-1} M^{k-2} + \cdots + q_{n-k+1} I \right) x^{n-k} \tag{2.1} \]

Then $B(x) = \mathrm{adj}(xI - M)$.

Example 2.3. If $n = 4$, then

\[ B(x) = I q_3 x^2 + (M q_3 + I q_2) x + (M^2 q_3 + M q_2 + I q_1) \]

Proof. (Lemma 2.2) First note that:

\[ \mathrm{adj}(xI - M) \cdot (xI - M) = \det(xI - M) I = q(x) I \]

Now multiply $B(x)$ by $(xI - M)$, and using the Cayley-Hamilton Theorem, we can
conclude that $B(x) \cdot (xI - M) = q(x) I$. Thus, the result follows as $q(x)$ is not the
zero polynomial; i.e., $(xI - M)$ is not singular. □
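
The multiplication step in the proof can be made concrete for the case $n = 4$ of Example 2.3 (so $M$ is $3 \times 3$ and $q$ has degree 3). The following worked expansion is an illustration added here, not part of the original argument; the mixed terms in $M x^2$, $M^2 x$ and $M x$ cancel in pairs, and $q(M) = 0$ by the Cayley-Hamilton Theorem.

```latex
\begin{align*}
B(x)\,(xI - M) &= x\,B(x) - B(x)\,M \\
  &= q_3 x^3 I + q_2 x^2 I + q_1 x I - \left( q_3 M^3 + q_2 M^2 + q_1 M \right) \\
  &= \left( q(x) - q_0 \right) I - \left( q(M) - q_0 I \right) \\
  &= q(x)\, I - q(M) \\
  &= q(x)\, I .
\end{align*}
```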

From Lemma 2.1 and Lemma 2.2 we have the following identity which is the basis
for Berkowitz's algorithm:

\[ p(x) = (x - a_{11})\, q(x) - R \cdot B(x) \cdot S \tag{2.2} \]

Using (2.2), we can express the char poly of a matrix as an iterated matrix product.
Again, suppose that $A$ is of the form:

\[ A = \begin{pmatrix} a_{11} & R \\ S & M \end{pmatrix} \]


Definition 2.4. We say that an $n \times m$ matrix is Toeplitz if the values on each
diagonal are the same. We say that a matrix is upper triangular if all the values below
the main diagonal are zero. A matrix is lower triangular if all the values above the
main diagonal are zero.



If we express equation (2.2) in matrix form we obtain:

\[ p = C_1 q \tag{2.3} \]

where $C_1$ is an $(n+1) \times n$ Toeplitz lower triangular matrix, and where the entries in
the first column are defined as follows:

\[ c_{i1} = \begin{cases} 1 & \text{if } i = 1 \\ -a_{11} & \text{if } i = 2 \\ -(R M^{i-3} S) & \text{if } i \ge 3 \end{cases} \tag{2.4} \]


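Example 2.5. If $A$ is a $4 \times 4$ matrix, then $p = C_1 q$ is given by:

\[
\begin{pmatrix} p_4 \\ p_3 \\ p_2 \\ p_1 \\ p_0 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
-a_{11} & 1 & 0 & 0 \\
-RS & -a_{11} & 1 & 0 \\
-RMS & -RS & -a_{11} & 1 \\
-RM^2S & -RMS & -RS & -a_{11}
\end{pmatrix}
\begin{pmatrix} q_3 \\ q_2 \\ q_1 \\ q_0 \end{pmatrix}
\]
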

Berkowitz's algorithm consists in repeating this for $q$ (i.e., $q$ can itself be computed
as $q = C_2 r$, where $r$ is the char polynomial of $M[1|1]$), and so on, and eventually
expressing $p$ as a product of matrices:

\[ p = C_1 C_2 \cdots C_n \]

We provide the details in the next definition.
Definition 2.6 (Berkowitz's algorithm). Given an $n \times n$ matrix $A$, over any
field $K$, Berkowitz's algorithm computes an $(n+1) \times 1$ column vector $p_A$ as follows:

Let $C_j$ be an $(n+2-j) \times (n+1-j)$ Toeplitz and lower-triangular matrix, where
the entries in the first column are defined as follows:

\[ c_{i1} = \begin{cases} 1 & \text{if } i = 1 \\ -a_{jj} & \text{if } i = 2 \\ -R_j M_j^{i-3} S_j & \text{if } 3 \le i \le n+2-j \end{cases} \tag{2.5} \]

where $M_j$ is the $j$-th principal sub-matrix, so $M_1 = A[1|1]$, $M_2 = M_1[1|1]$, and in
general $M_{j+1} = M_j[1|1]$, and $R_j$ and $S_j$ are given by:

\[ \begin{pmatrix} a_{j(j+1)} & a_{j(j+2)} & \ldots & a_{jn} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} a_{(j+1)j} & a_{(j+2)j} & \ldots & a_{nj} \end{pmatrix}^T \]

respectively. Then:

\[ p_A = C_1 C_2 \cdots C_n \tag{2.6} \]

Note that Berkowitz's algorithm is field independent (there are no divisions in the
computation of $p_A$), and so all our results are field independent.
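
Definition 2.6 can be turned into a short sequential program. The sketch below is an illustration only: it evaluates the product (2.6) from right to left, one Toeplitz matrix at a time, rather than in parallel, and the names `berkowitz`, `dot` and `mat_vec` are hypothetical. It assumes the matrix is a list of rows over any commutative ring supported by Python arithmetic (integers here), and uses no divisions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    """Matrix-vector product, with M given as a list of rows."""
    return [dot(row, v) for row in M]


def berkowitz(A):
    """Return [p_n, ..., p_0], the coefficients of det(xI - A), highest degree first."""
    n = len(A)
    p = [1]  # char poly of the empty (0 x 0) matrix
    # Apply C_n first, then C_{n-1}, ..., finally C_1 (j is 0-based here).
    for j in range(n - 1, -1, -1):
        a = A[j][j]
        R = A[j][j + 1:]                          # row to the right of a_jj
        S = [A[i][j] for i in range(j + 1, n)]    # column below a_jj
        M = [row[j + 1:] for row in A[j + 1:]]    # trailing principal sub-matrix
        m = n - j                                 # this C has shape (m+1) x m
        # First column of C: 1, -a_jj, -R S, -R M S, -R M^2 S, ...
        col = [1, -a]
        v = S
        for _ in range(m - 1):
            col.append(-dot(R, v))
            v = mat_vec(M, v)
        # Toeplitz lower-triangular product: (C p)_i = sum_k col[i-k] * p[k].
        p = [sum(col[i - k] * p[k] for k in range(min(i, m - 1) + 1))
             for i in range(m + 1)]
    return p


print(berkowitz([[1, 2], [3, 4]]))                   # -> [1, -5, -2]
print(berkowitz([[2, 0, 1], [1, 3, 0], [0, 1, 4]]))  # -> [1, -9, 26, -25]
```

The first call corresponds to $x^2 - 5x - 2$ and the second to $x^3 - 9x^2 + 26x - 25$.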
Lemma 2.7. Berkowitz's algorithm is an NC$^2$ algorithm.

Proof. This follows from (2.6): $p_A$ is given as a product of matrices, each $C_i$ can
be computed independently of the other $C_j$'s, so we have a sequence of $C_1, C_2, \ldots, C_n$
matrices, independently computed, so we can compute their product with repeated
squaring of the matrix $B$, which is constructed by placing the $C_i$'s above the main
diagonal of an otherwise all zero matrix.

Now the entries of each $C_i$ can also be computed using matrix products, again
independently of each other. In fact, we can compute the $(i,j)$-th entry of the $k$-th
matrix very quickly as in (2.5).

Finally, we can compute additions, additive inverses, and products of the
underlying field elements (in fact, more generally, of the elements in the underlying
commutative ring, as we do not need divisions in this algorithm). We claim that
these operations can be done with small NC$^1$ circuits (this is certainly true for the
standard examples: finite fields, rationals, integers, etc.).

Thus we have "three layers": one layer of NC$^1$ circuits, and two layers of NC$^2$
circuits (one layer for computing the entries of the $C_j$'s, and another layer for
computing the product of the $C_j$'s), and so we have (very uniform) NC$^2$ circuits that
compute a column vector with the coefficients of the char polynomial of a given
matrix. □

In this section we showed that Berkowitz's algorithm computes the coefficients of
the char polynomial correctly, by first proving Samuelson's Identity, and then using
the Cayley-Hamilton Theorem, to finally obtain that equation (2.6) computes the
coefficients of the char polynomial correctly. This is a very indirect approach, and we
lose insight into what is actually being computed when we are presented with
equation (2.6). However, the underlying fact is the Lagrange expansion of the determinant
(that is how Samuelson's Identity and the Cayley-Hamilton Theorem are proved). In
the next section, we take equation (2.6) naively, and we give a combinatorial proof of
its correctness with clow sequences and the Lagrange expansion.



3. Clow Sequences. First of all, a "clow" is an acronym for "closed walk."
Clow sequences (introduced in [?], based on ideas in [?]), can be thought of as
generalized permutations. They provide a very good insight into what is actually being
computed in Berkowitz's algorithm.

In the last section, we derived Berkowitz's algorithm from Samuelson's Identity
and the Cayley-Hamilton Theorem. However, both these principles are in turn proved
using Lagrange's expansion for the determinant. Thus, this proof of correctness of
Berkowitz's algorithm is indirect, and it does not really show what is being computed
in order to obtain the char polynomial.

To see what is being computed in Berkowitz's algorithm, and to understand
the subtle cancellations of terms, it is useful to look at the coefficients of the char
polynomial of a matrix $A$ as given by determinants of minors of $A$.
To define this notion precisely, let $A$ be an $n \times n$ matrix, and define $A[i_1, \ldots, i_k]$, where
$1 \le i_1 < i_2 < \cdots < i_k \le n$, to be the matrix obtained from $A$ by deleting the rows
and columns numbered by $i_1, \ldots, i_k$. Thus, using this notation, $A[1|1] = A[1]$, and
$A[2, 3, 8]$ would be the matrix obtained from $A$ by deleting rows and columns 2, 3, 8.

Now, it is not difficult to show from the Lagrange's expansion of $\det(xI - A)$ that
$p_{n-k}$, where $p_n, p_{n-1}, \ldots, p_0$ are the coefficients of the char polynomial of $A$, is given
by the following formula:

\[ p_{n-k} = \sum_{1 \le i_1 < i_2 < \cdots < i_{n-k} \le n} \det(A[i_1, i_2, \ldots, i_{n-k}]) \tag{3.1} \]

Since $\det(A[i_1, i_2, \ldots, i_{n-k}])$ can be computed using the Lagrange's expansion, it follows
from (3.1) that each coefficient of the char polynomial can be computed by summing
over permutations of minors of $A$:

\[ p_{n-k} = \sum_{1 \le i_1 < i_2 < \cdots < i_{n-k} \le n} \; \sum_{\sigma \in S\{j_1, j_2, \ldots, j_k\}} \mathrm{sign}(\sigma)\, a_{j_1 \sigma(j_1)} a_{j_2 \sigma(j_2)} \cdots a_{j_k \sigma(j_k)} \tag{3.2} \]

Note that the set $\{j_1, j_2, \ldots, j_k\}$ is understood to be the complement of the set
$\{i_1, i_2, \ldots, i_{n-k}\}$ in $\{1, 2, \ldots, n\}$, and $\sigma$ is a permutation in $S_k$ that permutes the
elements in the set $\{j_1, j_2, \ldots, j_k\}$. We re-arranged the subscripts in (3.2) to make
the expression more compatible later with clow sequences. Note that when $k = n$,
we are simply computing the determinant, since in that case the first summation is
empty, and the second summation spans over all the permutations in $S_n$:

\[ \det(A) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)} \]

Finally, note that if $k = 0$, then the second summation is empty, and there is just one
sequence that satisfies the condition of the first summation (namely $1 < 2 < \cdots < n$),
so the result is 1 by convention.

We can interpret $\sigma \in S_n$ as a directed graph $G_\sigma$ on $n$ vertices: if $\sigma(i) = j$, then
$(i, j)$ is an edge in $G_\sigma$, and if $\sigma(i) = i$, then $G_\sigma$ has the self-loop $(i, i)$.

Example 3.1. The permutation given by:

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 1 & 2 & 4 & 6 & 5 \end{pmatrix} \]

corresponds to the directed graph $G_\sigma$ with 6 nodes and the following edges:

\[ \{(1,3), (2,1), (3,2), (4,4), (5,6), (6,5)\} \]

where $(4,4)$ is a self-loop.

Given a matrix $A$, define the weight of $G_\sigma$, $w(G_\sigma)$, as the product of the $a_{ij}$'s such that
$(i, j) \in G_\sigma$. So $G_\sigma$ in Example 3.1 has a weight given by: $w(G_\sigma) = a_{13} a_{21} a_{32} a_{44} a_{56} a_{65}$.
Using the new terminology, we can restate equation (3.2) as follows:

\[ p_{n-k} = \sum_{1 \le i_1 < i_2 < \cdots < i_{n-k} \le n} \; \sum_{\sigma \in S\{j_1, j_2, \ldots, j_k\}} \mathrm{sign}(\sigma)\, w(G_\sigma) \tag{3.2'} \]


The graph-theoretic interpretation of permutations gets us closer to clow
sequences. The problem that we have with (3.2') is that there are too many
permutations, and there is no (known) way of grouping or factoring them, in such a way
so that we can save computing all the terms $w(G_\sigma)$, or at least so that these terms
cancel each other out as we go.

The way to get around this problem is by generalizing the notion of permutation
(or cycle cover, as permutations are called in the context of the graph-theoretic
interpretation of $\sigma$). Instead of summing over cycle covers, we sum over clow sequences;
the paradox is that there are many more clow sequences than cycle covers, but we can
efficiently compute the sums of clow sequences (with Berkowitz's algorithm), making
a clever use of cancellations of terms as we go along. We now introduce all the
necessary definitions, following [?].

Definition 3.2. A clow is a walk $(w_1, \ldots, w_l)$ starting from vertex $w_1$ and
ending at the same vertex, where any $(w_i, w_{i+1})$ is an edge in the graph. Vertex $w_1$ is
the least-numbered vertex in the clow, and it is called the head of the clow. We also
require that the head occur only once in the clow. This means that there is exactly
one incoming edge $(w_l, w_1)$, and one outgoing edge $(w_1, w_2)$ at $w_1$, and $w_i \ne w_1$ for
$i \ne 1$. The length of a clow $(w_1, \ldots, w_l)$ is $l$. Note that clows are not allowed to be
empty, since they always must have a head.

Example 3.3. Consider the clow $C$ given by $(1, 2, 3, 2, 3)$ on four vertices. The
head of clow $C$ is vertex 1, and the length of $C$ is 5.

Definition 3.4. A clow sequence is a sequence of clows $(C_1, \ldots, C_k)$, where
head$(C_1) < \ldots <$ head$(C_k)$. The length of a clow sequence is the sum of the lengths
of the clows (i.e., the total number of edges, counting multiplicities). Note that a
cycle cover is a special type of a clow sequence.

Definition 3.5. We define the sign of a clow sequence to be $(-1)^k$ where $k$ is
the number of clows in the sequence.

Example 3.6. We list the clow sequences associated with the three vertices
{1,2,3}. We give the sign of the corresponding clow sequences in the right-most
column:

     1.   (1),(2),(3)    $(-1)^{3+3} = 1$
     2.   (1,2),(3)      $(-1)^{3+2} = -1$
     3.   (1,2,2)        $(-1)^{3+1} = 1$
     4.   (1,2),(2)      $(-1)^{3+2} = -1$
     5.   (1),(2,3)      $(-1)^{3+2} = -1$
     6.   (1,2,3)        $(-1)^{3+1} = 1$
     7.   (1,3,3)        $(-1)^{3+1} = 1$
     8.   (1,3),(3)      $(-1)^{3+2} = -1$
     9.   (1,3,2)        $(-1)^{3+1} = 1$
    10.   (1,3),(2)      $(-1)^{3+2} = -1$
    11.   (2,3,3)        $(-1)^{3+1} = 1$
    12.   (2,3),(3)      $(-1)^{3+2} = -1$


Note that the number of permutations on 3 vertices is $3! = 6$, and indeed, the clow
sequences {3,4,7,8,11,12} do not correspond to cycle covers. We listed these clow
sequences which do not correspond to cycle covers by pairs: {3,4}, {7,8}, {11,12}.
Consider the first pair: {3,4}. We will later define the weight of a clow (simply
the product of the labels of the edges), but notice that clow sequence 3 corresponds
to $a_{12} a_{22} a_{21}$ and clow sequence 4 corresponds to $a_{12} a_{21} a_{22}$, which is the same value;
however, they have opposite signs, so they cancel each other out. Same for pairs {7,8}
and {11,12}. We make this informal observation precise with the following definitions,
and in Theorem 3.10 we show that clow sequences which do not correspond to cycle
covers cancel out.
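
The list in Example 3.6 can be reproduced mechanically. The following sketch is an illustration added here, with hypothetical function names; it assumes the complete directed graph with self-loops on vertices 1..n, represents a clow as a tuple whose first entry is the head, and represents a clow sequence as a tuple of clows.

```python
from itertools import product


def clows(head, length, n):
    """All clows of the given length with the given head on vertices 1..n.

    Every vertex after the head must be strictly larger than the head,
    so the head is the least vertex and occurs exactly once.
    """
    larger = range(head + 1, n + 1)
    return [(head,) + rest for rest in product(larger, repeat=length - 1)]


def clow_sequences(total, n, min_head=1):
    """All clow sequences of total length `total` with strictly increasing heads."""
    if total == 0:
        return [()]
    out = []
    for head in range(min_head, n + 1):
        for length in range(1, total + 1):
            for c in clows(head, length, n):
                for rest in clow_sequences(total - length, n, head + 1):
                    out.append((c,) + rest)
    return out


def is_disjoint_cycles(seq):
    """True if no vertex is visited twice, i.e. the clows are disjoint simple cycles."""
    visited = [v for c in seq for v in c]
    return len(visited) == len(set(visited))


seqs = clow_sequences(3, 3)
for s in seqs:
    kind = "cycle cover" if is_disjoint_cycles(s) else "not a cycle cover"
    print(s, kind)
print(len(seqs), sum(not is_disjoint_cycles(s) for s in seqs))  # -> 12 6
```

The enumeration order differs from the numbering in Example 3.6, but the same twelve sequences appear, six of which are not cycle covers.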

Given a matrix A, we associate a weight with a clow sequence that is consistent 
with the contribution of a cycle cover. Note that we can talk about clows and clow 
sequences independently of a matrix, but once we associate weights with clows, we 
have to specify the underlying matrix, in order to label the edges. Thus, to make 
things more precise, we will sometimes say "clow sequences on A" to emphasize that 
the weights come from A. 

Definition 3.7. Given a matrix $A$, the weight of a clow $C$, denoted $w(C)$, is
the product of the weights of the edges in the clow, where edge $(i, j)$ has weight $a_{ij}$.

Example 3.8. Given a matrix $A$, the weight of clow $C$ in Example 3.3 is given
by:

\[ w((1, 2, 3, 2, 3)) = a_{12} a_{23} a_{32} a_{23} a_{31} \]

Definition 3.9. Given a matrix $A$, the weight of a clow sequence $\mathcal{C}$, denoted
$w(\mathcal{C})$, is the product of the weights of the clows in $\mathcal{C}$. Thus, if $\mathcal{C} = (C_1, \ldots, C_k)$,
then:

\[ w(\mathcal{C}) = \prod_{i=1}^{k} w(C_i) \]


We make the convention that an empty clow sequence has weight 1. Since a clow
must consist of at least one vertex, a clow sequence is empty iff it has length zero.
Thus, equivalently, a clow sequence of length zero has weight 1. These statements
will be important when we link clow sequences with Berkowitz's algorithm.

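Theorem 3.10. Let $A$ be an $n \times n$ matrix, and let $p_n, p_{n-1}, \ldots, p_0$ be the
coefficients of the char polynomial of $A$ given by $\det(xI - A)$. Then:

\[ p_{n-k} = \sum_{\mathcal{C} \in \mathcal{C}_k} \mathrm{sign}(\mathcal{C})\, w(\mathcal{C}) \tag{3.3} \]

where $\mathcal{C}_k = \{ \mathcal{C} \mid \mathcal{C}$ is a clow sequence on $A$ of length $k \}$.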

Proof. We generalize the proof given in [?, pp. 5-8] for the case $k = n$. The main
idea in the proof is that clow sequences which are not cycle covers cancel out, just as
in Example 3.6, so the contribution of clow sequences which are not cycle covers is
zero.

Suppose that $(C_1, \ldots, C_j)$ is a clow sequence in $A$ of length $k$. Choose the smallest
$i$ such that $(C_{i+1}, \ldots, C_j)$ is a set of disjoint cycles. If $i = 0$, $(C_1, \ldots, C_j)$ is a cycle
cover. Otherwise, if $i > 0$, we have a clow sequence which is not a cycle cover, so we
show how to find another clow sequence (which is also not a cycle cover) of the same
weight and length, but opposite sign. The contribution of this pair to the summation
in (3.3) will be zero.

So suppose that $i > 0$, and traverse $C_i$ starting from the head until one of two
possibilities happens: (i) we hit a vertex that is in $(C_{i+1}, \ldots, C_j)$, or (ii) we hit a
vertex that completes a simple cycle in $C_i$. Denote this vertex by $v$. In case (i), let
$C_p$ be the intersected clow ($p \ge i + 1$), and join $C_i$ and $C_p$ at $v$ (so we merge $C_i$ and $C_p$).
In case (ii), let $C$ be the simple cycle containing $v$: detach it from $C_i$ to get a new
clow.

In either case, we created a new clow sequence, of opposite sign but same weight
and same length $k$. Furthermore, the new clow sequence is still not a cycle cover, and
if we applied the above procedure to the new clow sequence, we would get back
the original clow sequence (hence our procedure defines an involution on the set of
clow sequences). □
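
Theorem 3.10 can be checked numerically on small matrices. The sketch below is an illustration, not part of the proof; all function names are hypothetical. It sums signed clow-sequence weights with the sign $(-1)^k$ of Definition 3.5 and compares the result with the coefficient of $x^{n-k}$ in $\det(xI - A)$, computed under the standard fact (written out explicitly here as an assumption of the sketch) that this coefficient is $(-1)^k$ times the sum of the $k \times k$ principal minors.

```python
from itertools import combinations, permutations, product


def clow_sequences(total, n, min_head=1):
    """Yield all clow sequences of total length `total` on vertices 1..n."""
    if total == 0:
        yield ()
        return
    for head in range(min_head, n + 1):
        for length in range(1, total + 1):
            for rest in product(range(head + 1, n + 1), repeat=length - 1):
                clow = (head,) + rest
                for tail in clow_sequences(total - length, n, head + 1):
                    yield (clow,) + tail


def weight(seq, A):
    """Product of a_uv over all edges (u, v) of all clows, including closing edges."""
    w = 1
    for clow in seq:
        for u, v in zip(clow, clow[1:] + clow[:1]):
            w *= A[u - 1][v - 1]
    return w


def clow_sum(A, k):
    """Right-hand side of (3.3): signed sum over clow sequences of length k."""
    return sum((-1) ** len(seq) * weight(seq, A)
               for seq in clow_sequences(k, len(A)))


def det(M):
    """Determinant by the permutation (Lagrange) expansion; exponential time."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        term = (-1) ** inversions
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total


def coeff_from_minors(A, k):
    """Coefficient of x^(n-k) in det(xI - A) via k x k principal minors."""
    n = len(A)
    return (-1) ** k * sum(det([[A[i][j] for j in rows] for i in rows])
                           for rows in combinations(range(n), k))


A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
print([clow_sum(A, k) for k in range(4)])           # -> [1, -9, 26, -25]
print([coeff_from_minors(A, k) for k in range(4)])  # -> [1, -9, 26, -25]
```

Both computations are exponential and serve only as a sanity check; the point of the theorem is that the non-cycle-cover terms in the first sum cancel in pairs.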

In [?], Valiant points out that Berkowitz's algorithm computes sums of what he
calls "loop covers." We show that Berkowitz's algorithm computes sums of slightly
restricted clow sequences, which are nevertheless equal to the sums of all clow
sequences, and therefore, by Theorem 3.10, Berkowitz's algorithm computes the
coefficients $p_{n-k}$ of the char polynomial of $A$ correctly. We formalize this argument in the
next theorem, which is the central result of this paper.
Theorem 3.11. Let $A$ be an $n \times n$ matrix, and let:

\[ p_A = \begin{pmatrix} p_n \\ p_{n-1} \\ \vdots \\ p_0 \end{pmatrix} \]

be as defined by equation (2.6); that is, $p_A$ is the result of running Berkowitz's algorithm
on $A$. Then, for $0 \le i \le n$, we have:

\[ p_{n-i} = \sum_{\mathcal{C} \in \mathcal{C}_i} \mathrm{sign}(\mathcal{C})\, w(\mathcal{C}) \tag{3.4} \]

where $\mathcal{C}_i = \{ \mathcal{C} \mid \mathcal{C}$ is a clow sequence on $A$ of length $i \}$.
Before we prove this theorem, we give an example. 

Example 3.12. Suppose that $A$ is a $3 \times 3$ matrix, $M = A[1|1]$ as usual, and
$p_3, p_2, p_1, p_0$ are the coefficients of the char poly of $A$ and $q_2, q_1, q_0$ are the coefficients
of the char poly of $M$, computed by Berkowitz's algorithm. Thus:

\[
\begin{pmatrix} p_3 \\ p_2 \\ p_1 \\ p_0 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 \\
-a_{11} & 1 & 0 \\
-RS & -a_{11} & 1 \\
-RMS & -RS & -a_{11}
\end{pmatrix}
\begin{pmatrix} q_2 \\ q_1 \\ q_0 \end{pmatrix}
=
\begin{pmatrix}
q_2 \\
-a_{11} q_2 + q_1 \\
-RS q_2 - a_{11} q_1 + q_0 \\
-RMS q_2 - RS q_1 - a_{11} q_0 \quad (*)
\end{pmatrix}
\tag{3.5}
\]


We assume that the coefficients $q_2, q_1, q_0$ are given by sums of clow sequences on $M$,
that is, by clow sequences on vertices {2, 3}. Using this assumption and equation (3.5),
we show that $p_3, p_2, p_1, p_0$ are given by clow sequences on $A$, just as in the statement
of Theorem 3.11.

Since $q_2 = 1$, $p_3 = 1$ as well. Note that $q_2 = 1$ is consistent with our statement
that it is the sum of restricted clow sequences of length zero, since there is only one
empty clow sequence, and its weight is by convention 1 (see Definition 3.9).

Consider $p_2$, which by definition is supposed to be the sum of clow sequences of
length one on all three vertices. This is the sum of clow sequences of length one on
vertices 2 and 3 (i.e., $q_1$), plus the clow sequence consisting of a single self-loop on
vertex 1 with weight $a_{11}$ and sign $(-1)^1 = -1$ (see Definition 3.5). Hence, the sum
is indeed $-a_{11} q_2 + q_1$, as in equation (3.5) (again, $q_2 = 1$).

Consider $p_1$. Since $p_1 = p_{3-2}$, $p_1$ is the sum of clow sequences of length two.
We are going to show that the term $-RS q_2 - a_{11} q_1 + q_0$ is equal to the sum of clow
sequences of length 2 on $A$. First note that the clow sequences of length two on
vertices 2 and 3 (that is, those avoiding vertex 1) are given by $q_0$. There are two clow
sequences of length two which include a self-loop at vertex 1. These correspond to the
term $-a_{11} q_1$. Note that the negative sign comes from the fact that $q_1$ has a negative
value, but there are two clows per sequence, so the parity is even, according to
Definition 3.5. Finally, we consider the clow sequences of length two, where there is no
self-loop at vertex 1. Since vertex 1 must be included, there are only two possibilities;
these clows correspond to the term $-RS q_2$ which is equal to:

\[ -\begin{pmatrix} a_{12} & a_{13} \end{pmatrix} \begin{pmatrix} a_{21} \\ a_{31} \end{pmatrix} = -a_{12} a_{21} - a_{13} a_{31} \]

since $q_2 = 1$.

For $p_0$, the reader can add up all the clows by following Example 3.6. One thing
to notice, when tracing this case, is that the summation indicated by (*) includes only
those clow sequences which start at vertex 1. This is because the bottom entry in
equation (3.5), unlike the other entries, does not have a 1 in the last column, and
hence there is no coefficient from the char poly of $M$ appearing by itself. This is
not a problem for the following reason: if vertex 1 is not included in a clow sequence
computing the last entry, then that clow sequence will cancel out anyway, since a
clow sequence of length 3 that avoids the first vertex cannot be a cycle cover! This
observation will be made more explicit in the proof below.



Proof. (Theorem 3.11) We prove this theorem by induction on the size of matrices.
The Basis Case is easy, since if $A$ is a $1 \times 1$ matrix, then $A = (a)$, so $p_A = (1 \; -a)^T$,
so $p_1 = 1$, and $p_0 = -a$, which is $(-1) \times$ the sum of clow sequences of length 1.

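In the Induction Step, suppose that $A$ is an $(n+1) \times (n+1)$ matrix and:

\[
\begin{pmatrix} p_{n+1} \\ p_n \\ p_{n-1} \\ \vdots \\ p_0 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & \cdots \\
-a_{11} & 1 & 0 & \cdots \\
-RS & -a_{11} & 1 & \cdots \\
\vdots & \vdots & \vdots & \ddots \\
-RM^{n-1}S & -RM^{n-2}S & -RM^{n-3}S & \cdots
\end{pmatrix}
\begin{pmatrix} q_n \\ q_{n-1} \\ q_{n-2} \\ \vdots \\ q_0 \end{pmatrix}
\tag{3.6}
\]
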



By the Induction Hypothesis, $q_M = (q_n \; q_{n-1} \; \cdots \; q_0)^T$ satisfies the statement of
the theorem for $M = A[1|1]$, that is, $q_{n-i}$ is equal to the sum of clow sequences of
length $i$ on $M = A[1|1]$.

Since $p_{n+1} = q_n$, $p_{n+1} = 1$. Since $p_n = -a_{11} q_n + q_{n-1} = -a_{11} + q_{n-1}$ (as $q_n = 1$),
using the fact that $q_{n-1}$ = the sum of clow sequences of length 1 on $M$, it follows
that $p_n$ = the sum of clow sequences of length 1 on $A$.

Now we prove this for general $n + 1 > i > 1$, that is, we prove that $p_{n+1-i}$ is the
sum of clow sequences of length $i$ on $A$. Note that:

\[ p_{n+1-i} = -RM^{i-2}S\, q_n - RM^{i-3}S\, q_{n-1} - \cdots - RS\, q_{n+2-i} - a_{11}\, q_{n+1-i} + q_{n-i} \tag{3.7} \]



as can be seen by inspection from equation (3.6). Observe that the $(i, j)$-th entry of
$M^k$ is the sum of clows in $M$ that start at vertex $i$ and end at vertex $j$ of length $k$,
and therefore, $RM^kS$ is the sum of clows in $A$ that start at vertex 1 (and of course
end at vertex 1, and vertex 1 is never visited otherwise), of length $k + 2$.
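
This observation can be checked directly on a small example. The sketch below is an illustration with hypothetical function names: it computes $RM^kS$ by repeated matrix-vector products and compares it with a brute-force sum of the weights of all clows headed at vertex 1 of length $k + 2$.

```python
from itertools import product


def r_mk_s(A, k):
    """Compute R M^k S for A = [[a11, R], [S, M]] given as a list of rows."""
    R = A[0][1:]
    S = [row[0] for row in A[1:]]
    M = [row[1:] for row in A[1:]]
    v = S
    for _ in range(k):
        v = [sum(a * b for a, b in zip(row, v)) for row in M]
    return sum(r * x for r, x in zip(R, v))


def clow_weight_sum(A, length):
    """Sum of weights of all clows with head 1 and the given length (>= 2)."""
    n = len(A)
    total = 0
    # Every vertex after the head is chosen from {2, ..., n}.
    for rest in product(range(2, n + 1), repeat=length - 1):
        walk = (1,) + rest
        w = 1
        for u, v in zip(walk, walk[1:] + walk[:1]):
            w *= A[u - 1][v - 1]
        total += w
    return total


A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
for k in range(4):
    print(k, r_mk_s(A, k), clow_weight_sum(A, k + 2))
```

For each $k$ the two printed values agree, since expanding the matrix product $RM^kS$ entry by entry enumerates exactly the walks $1 \to w_2 \to \cdots \to w_{k+2} \to 1$ with all intermediate vertices in $\{2, \ldots, n\}$.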

Therefore, $RM^{i-2-j}S\, q_{n-j}$, for $j = 0, \ldots, i - 2$, is the product of the sum of clows
of length $i - j$ (that start and end at vertex 1) and the sum of clow sequences of length
$j$ on $M$ (by the Induction Hypothesis), which is just the sum of clow sequences of
length $i$ where the first clow starts and ends at vertex 1, and has length $i - j$. Each
clow sequence of length $i$ on $A$ starts off with a clow anchored at the first vertex, and
the second to last term of equation (3.7), $-a_{11} q_{n+1-i}$, corresponds to the case where
the first clow is just a self-loop. Finally, the last term, given by $q_{n-i}$, contributes the
clow sequences of length $i$ which do not include the first vertex.

The last case is when $i = n + 1$, that is, $p_0$, which is the determinant of $A$ by
Theorem 3.10. As was mentioned at the end of Example 3.12, this is a special sum of
clow sequences, because the head of the first clow is always vertex 1. Here is where we
invoke the proof of Theorem 3.10: the last entry, $p_0$, can be shown to be the sum
of clow sequences, where the head of the first clow is always vertex 1, by following
an argument analogous to the one in the above paragraph. However, this sum is still
equal in value to the sum of all clow sequences (of length $n + 1$). This is because, if
we consider clow sequences of length $n + 1$, and there are $n + 1$ vertices, and we get
a clow sequence $\mathcal{C}$ which avoids the first vertex, then we know that $\mathcal{C}$ cannot be a
cycle cover, and therefore it will cancel out in the summation anyway, just as it was
shown to happen in the proof of Theorem 3.10. □